ABSTRACT: The importance of understanding SARS-CoV-2 evolution cannot be overlooked. Recent studies confirm that
natural selection is the dominant mechanism of SARS-CoV-2 evolution, favoring mutations that strengthen viral
infectivity. Here, we demonstrate that vaccine-breakthrough or antibody-resistant mutations provide a new mechanism
of viral evolution. Specifically, the vaccine-resistant mutation Y449S in the spike (S) protein receptor-binding
domain, which occurs in the co-mutation pair Y449S and N501Y, has reduced infectivity compared to that of the
original SARS-CoV-2 but can disrupt existing antibodies that neutralize the virus. By tracking the evolutionary
trajectories of vaccine-resistant mutations in more than 2.2 million SARS-CoV-2 genomes, we reveal that the
occurrence and frequency of vaccine-resistant mutations correlate strongly with the vaccination rates in Europe and
America. We anticipate that, as a complementary transmission pathway, vaccine-breakthrough or antibody-resistant
mutations, like those in Omicron, will become a dominant mechanism of SARS-CoV-2 evolution when most of the world's
population is either vaccinated or infected. Our study sheds light on SARS-CoV-2 evolution and transmission and
enables the design of next-generation mutation-proof vaccines and antibody drugs.


Since it started in late 2019, the coronavirus disease 2019 (COVID-19) pandemic caused by severe acute
respiratory syndrome coronavirus 2 (SARS-CoV-2) has had devastating impacts worldwide, plunging the world into
an economic recession. Although several authorized vaccines offered promise for controlling the disease in early
2021, the emergence of multiple SARS-CoV-2 variants indicates that the fight against SARS-CoV-2 will be
protracted. At this stage, almost all SARS-CoV-2 vaccines and monoclonal antibodies (mAbs) target the spike (S)
protein, while mutations on the S protein have been verified to compromise the efficacy of existing vaccines and
mAbs. Therefore, it is imperative to understand the mechanisms of viral mutation, especially on the S gene of
SARS-CoV-2, which will promote the development of mutation-proof vaccines and mAbs.

Mutagenesis is driven by various competing processes, which can be categorized into three scales with many
factors, as illustrated in Figure 1a: (1) the molecular scale, (2) the organism scale, and (3) the population
scale. At the molecular scale, reading frame shifts, replication errors, transcription errors, translation
errors, viral proofreading, and viral recombination are the main driving sources. At the organism scale, host
gene editing induced by the adaptive immune response and recombination between the host and the virus are the
key driving factors. Finally, natural selection, popularized by Charles Darwin, is a critical population-scale
process that favors mutations giving the virus a reproductive advantage and hence adaptive traits in evolution.
Such complicated mechanisms of viral mutagenesis make the comprehension of viral transmission and evolution a
grand challenge.

Although there are 28,912 unique single mutations distributed widely over the whole SARS-CoV-2 genome, the
mutations on the S gene stand out among all 29 genes of SARS-CoV-2 because of the mechanism of viral infection.
With the assistance of host transmembrane protease, serine 2 (TMPRSS2), SARS-CoV-2 enters the host cell through
the interaction between its S protein and the host angiotensin-converting enzyme 2 (ACE2) (see Figure 1b).
Later, antibodies are generated by the host immune system, aiming to eliminate the invading virus through direct
neutralization or non-neutralizing binding, which makes the S protein the main target of the current vaccines.
Specifically, a short immunogenic fragment of the S protein, called the receptor-binding domain (RBD),
facilitates the binding of the SARS-CoV-2 S protein to ACE2. Studies have shown that the binding free energy
(BFE) between the S RBD and ACE2 is proportional to the infectivity. Therefore, tracking and monitoring the RBD
mutations and their corresponding BFE changes will expedite the understanding of the infectivity, transmission,
and evolution of SARS-CoV-2, especially for the new SARS-CoV-2 variants, such as Alpha, Beta, Gamma, Delta, and
Lambda. Specifically, a positive BFE change between S and ACE2 induced by the mutations of a given variant
indicates strengthened infectivity, while a negative BFE change between S and ACE2 suggests an
infectivity-weakened variant.

Figure 1. (a) Mechanism of mutagenesis. Nine mechanisms are grouped into three scales: (1) molecule-based
mechanisms (green), (2) organism-based mechanisms (red), and (3) population-based mechanisms (blue). Reading
frame shifts (Shift), replication error (Rep), transcription error (Transcr), translation error (Trans), viral
proofreading (Proof), and recombination (Recomb) are the six molecule-based mechanisms. Gene editing and
host-virus recombination are the organism-based mechanisms. In addition, natural selection (Natural) is the
population-based mechanism, which is the main driving source in the transmission of SARS-CoV-2. (b) Sketch of
SARS-CoV-2 and its interaction with a host cell. (c) Illustration of the 30 single-site RBD mutations with the
top frequencies. The height of each bar shows the BFE change of each mutation; the color of each bar represents
the natural log of the frequency of each mutation, and the number at the top of each bar is the AI-predicted
number of antibody-RBD complexes that may be significantly disrupted by the single-site mutation. (d)
Illustration of the SARS-CoV-2 S protein with human ACE2. The blue chain represents human ACE2; the pink chain
represents the S protein, and the purple fragment on the S protein marks the site of the two vaccine-resistant
mutations Y449S and Y449H.
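
The sign convention for BFE changes described above can be written as a small classifier. This is a minimal sketch rather than the authors' pipeline: the function name `classify_by_bfe_change` is hypothetical, the Y449S value is the one reported later in this paper, and the second entry is a made-up placeholder.

```python
def classify_by_bfe_change(bfe_change_kcal_mol: float) -> str:
    """Label a mutation by the sign of its ACE2-S BFE change (kcal/mol)."""
    if bfe_change_kcal_mol > 0:
        return "infectivity-strengthening"
    if bfe_change_kcal_mol < 0:
        return "infectivity-weakening"
    return "neutral"


# Y449S value is quoted in the paper; "hypothetical_mutation" is a placeholder.
bfe_changes = {"Y449S": -0.8112, "hypothetical_mutation": 0.25}
labels = {name: classify_by_bfe_change(value) for name, value in bfe_changes.items()}
```

The sign alone says nothing about antibody escape; that requires the antibody-S BFE changes as well.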

The currently prevailing variants Alpha, Beta, Gamma, Delta, Kappa, Theta, Lambda, Mu, and Omicron each carry at
least one vital mutation at residue 452 or 501 of the S protein RBD. Notably, in early 2020, we successfully
predicted that residues 452 and 501 "have high chances to mutate into significantly more infectious COVID-19
strains". In the same work, we hypothesized that "natural selection favors those mutations that enhance the
viral transmission" and provided the first evidence for infectivity-based natural selection. In other words, we
revealed the mechanism of SARS-CoV-2 evolution and transmission based on very limited genome data in June 2020.
Additionally, we predicted three categories of RBD mutations: (1) most likely (1149 mutations), (2) likely (1912
mutations), and (3) unlikely (625 mutations). To date, almost all of the RBD mutations we have detected fall
into the first category. Moreover, all of the top 100 most observed RBD mutations have a BFE change greater than
−0.28 kcal/mol, the average BFE change over all RBD mutations. The odds of 100 RBD mutations all having BFE
changes above the average value by accident are extremely low [i.e., 1/(1.27 × 10^30)], which confirms our
hypothesis that the transmission and evolution of new SARS-CoV-2 variants are governed by infectivity-based
natural selection, despite all other competing mechanisms. Our predictions rely on algebraic topology-assisted
deep learning and have been extensively validated.

Figure 2. Most significant RBD mutations. (a) Time evolution of RBD mutations with their mutation-induced BFE
changes per 60 days from April 30, 2020, to October 22, 2021. Here, only the top 100 most observed RBD mutations
are displayed. Each bar represents a single RBD mutation. The height and color of each bar represent the log
frequency and the ACE2-S BFE change induced by a given RBD mutation. The red star marks the vaccine-resistant
mutations with significantly negative BFE changes. (b) Time evolution of RBD mutations with their experimental
mutation-induced log2 enrichment ratio changes per 60 days from April 30, 2020, to October 22, 2021. The height
and color of each bar represent the log frequency and the enrichment ratio change induced by a given RBD
mutation. The red star marks vaccine-resistant mutations with significantly negative BFE changes.

However, infectivity is not the only transmission pathway that governs viral evolution. Vaccine-resistant
mutations or, more precisely,
antibody-resistant mutations that can disrupt the protection of antibodies have become a viable mechanism for
new variants to spread among the vaccinated population since vaccines became available. In early January 2021,
we predicted that RBD mutations W353R, I401N, Y449D, Y449S, P491R, P491L, Q493P, etc., would weaken the binding
of most antibodies to the S protein. Later, we provided a list of the most likely vaccine-escape RBD mutations
with high frequency, including S494P, Q493L, K417N, F490S, F486L, R403K, E484K, L452R, K417T, F490L, E484Q, and
A475S. Moreover, we pointed out that Y449S and Y449H are two vaccine-resistant mutations and that "Y449S, S494P,
K417N, F490S, L452R, E484K, K417T, E484Q, L452Q, and N501Y" are the top 10 high-frequency mutations that will
disrupt most antibodies. As mentioned in ref 26, RBD mutations such as E484K/A, Y489H, Q493K, and N501Y found in
late-stage evolved S variants "confer resistance to a common class of SARS-CoV-2 neutralizing antibodies", which
suggests that viral evolution is also regulated by vaccine-resistant mutations. Interestingly, experimental
results confirm that Y449, L455, F456, E484, F486, N487, Y489, Q493, S494, and Y505 are important for antibody
binding, which means that mutations at these residues may enable the virus to escape antibodies. Notably, the
most common mode of binding between antibodies and the S protein is through hydrophobic contacts, and Y449 is
located at the receptor-binding motif with hydrophobic side chains, indicating that it is one of the vital
residues for the binding between antibodies and the S protein.

Figure 3. RBD co-mutation analysis. (a) Time evolutionary trajectory of two RBD co-mutations with their
mutation-induced BFE changes per 30 days from January 25, 2021, to October 22, 2021. Each bar represents a pair
of RBD co-mutations. The height and color of each bar represent the log frequency and the ACE2-S BFE change
induced by a given RBD co-mutation. Red stars mark the two co-mutations with significantly negative BFE changes.
(b) Time evolutionary trajectory of three RBD co-mutations with their mutation-induced BFE changes per 30 days
from June 24, 2021, to October 22, 2021. Each bar represents an RBD co-mutation. The height and color of each
bar represent the log frequency and the ACE2-S BFE change induced by a given RBD co-mutation. (c) Time
evolutionary trajectory of four RBD co-mutations with their mutation-induced BFE changes per 30 days from June
24, 2021, to October 22, 2021. Each bar represents an RBD co-mutation. The height and color of each bar
represent the log frequency and the ACE2-S BFE change induced by a given RBD co-mutation. (d) Illustration of
the top 50 most observed RBD co-mutations. Here, the length of each bar represents the total ACE2-S BFE change
induced by a specific RBD co-mutation; the color of each bar represents the natural log frequency of each
co-mutation, and the number at the side of each bar is the AI-predicted antibody disruption count.
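
The odds quoted earlier for the top 100 most observed RBD mutations can be checked with a short calculation. The sketch below assumes, as a simplification that the text does not state explicitly, that each mutation independently has a 1/2 chance of having a BFE change above the average; under that assumption the chance that all 100 do so is (1/2)^100.

```python
from fractions import Fraction

# Assumption: each mutation independently lands above the average with p = 1/2.
p_above_average = Fraction(1, 2)
n_mutations = 100

# Probability that all 100 mutations are above the average by chance.
p_all_above = p_above_average ** n_mutations

# Express the result as 1 / (coefficient x 10^30) to compare with the quoted value.
coefficient = float(p_all_above.denominator) / 1e30
print(f"1 / ({coefficient:.2f} x 10^30)")  # prints "1 / (1.27 x 10^30)"
```

Exact rational arithmetic is used so that the denominator, 2^100, is not rounded before it is compared with the quoted figure.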

The objective of this work is to analyze how the mechanisms of SARS-CoV-2 evolution themselves change over time,
driven by complementary viral transmission pathways. We demonstrate how the interplay among molecular-scale,
organism-scale, and population-scale mechanisms of SARS-CoV-2 mutations has affected the evolution of
SARS-CoV-2. As the primary driving source of mutagenesis, molecular-scale mechanisms such as reading frame
shifts, proofreading, etc., change the genetic information initially. Next, gene editing acts at the organism
scale, reflecting the immune response of the host to the virus. Then, the population-scale mechanism governs the
transmission pathways of viral evolution. Two complementary pathways (infectivity and vaccine resistance)
regulated by natural selection become the predominant evolution-driving forces. The RBD mutations regulated by
the infectivity-based pathway exist in the prevailing variants, while the mutations regulated by the
vaccine-resistant pathway start to emerge in countries with relatively high vaccination rates. In this work,
2,298,349 complete SARS-CoV-2 genomes isolated from patients are decoded by single-nucleotide polymorphism (SNP)
calling, from which a total of 28,912 unique single mutations are detected. Among them, 774 RBD mutations were
discovered by October 20, 2021 (detailed information can be found in section S6 of the Supporting Information).
On the basis of our comprehensive topology-based artificial intelligence (AI) model for predicting RBD
mutation-induced BFE changes of RBD-ACE2 and RBD-antibody complexes, the transmission trajectory of
vaccine-resistant RBD mutations is analyzed (detailed information about the methods and model can be found in
sections S1 and S2 of the Supporting Information). Moreover, the vaccine-resistant RBD mutation Y449S, which has
been found in more than 1000 isolates, is discussed. Furthermore, the vaccination rates of 12 countries where
Y449S is found are also analyzed, which provides a sound explanation of the relation between the emergence of
vaccine-resistant mutations and the vaccination rate. Such an understanding of two complementary transmission
pathways will shed light on the long-term efficacy of S-targeted antibody countermeasures and benefit the
development of next-generation mutation-proof vaccines and mAbs.
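
As a rough illustration of what SNP calling produces, the sketch below compares an isolate with a reference sequence position by position. It assumes the two sequences are already aligned and of equal length, which a real pipeline must first establish with a sequence aligner. The function names, the toy sequences, and the `G3A`-style labels are illustrative and are not taken from the authors' pipeline.

```python
def call_snps(reference: str, isolate: str) -> list[str]:
    """Return substitutions as '<ref><1-based position><alt>' labels.

    Assumes both sequences are pre-aligned and of equal length.
    Ambiguous bases ('N') and gaps ('-') in the isolate are skipped.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, isolate), start=1):
        if alt_base in ("N", "-") or ref_base == alt_base:
            continue
        snps.append(f"{ref_base}{position}{alt_base}")
    return snps


def unique_mutations(reference: str, isolates: list[str]) -> set[str]:
    """Collect the set of unique single mutations across all isolates."""
    found = set()
    for isolate in isolates:
        found.update(call_snps(reference, isolate))
    return found


# Toy data, not real SARS-CoV-2 sequence.
toy_reference = "ATGGCTTAC"
toy_isolates = ["ATGGCTTAC", "ATAGCTTAC", "ATAGCTNAT"]
toy_unique = unique_mutations(toy_reference, toy_isolates)
```

Counting the size of the resulting set over all isolates is what yields a figure such as the number of unique single mutations; insertions and deletions would need separate handling.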

Studying the mechanisms of SARS-CoV-2 mutagenesis is beneficial to the understanding of viral transmission and
evolution. The main driving force of viral evolution is natural selection, which acts through two complementary
transmission pathways: (1) the infectivity-based pathway and (2) the vaccine-resistant pathway. We have
discussed the infectivity-based pathway in refs 21 and 29. This section focuses on the vaccine-resistant pathway
and its impact on the transmission and evolution of SARS-CoV-2. To understand the mechanisms of
vaccine-resistant mutations, we first analyze 2,298,349 complete SARS-CoV-2 genomes, from which a total of
28,912 unique single mutations are decoded. Among them, there are 774 non-degenerate RBD mutations. The
infectivity of SARS-CoV-2 is proportional to the BFE between the S RBD and ACE2. Therefore, the BFE change
induced by a specific RBD mutation reveals whether the mutation is infectivity-strengthening or
infectivity-weakening. Similarly, the BFE change between the S RBD and an antibody induced by a given mutation
reveals whether this mutation will strengthen or weaken the binding between S and the antibody. To date, we have
collected 130 antibody structures (see section S6 of the Supporting Information), which include Food and Drug
Administration (FDA)-approved mAbs from Eli Lilly and Regeneron. For a specific RBD mutation, its antibody
disruption count is the number of antibodies whose antibody-S BFE changes are less than −0.3 kcal/mol. The
ACE2-S and antibody-S BFE changes induced by RBD mutations are predicted by our TopNetTree model, which is
available at TopNetmAb. All of the predicted BFE changes induced by RBD mutations can be found at Mutation
Analyzer (https://weilab.math.msu.edu/MutationAnalyzer/). Figure 1c illustrates the top 30 most
observed RBD mutations. The height and color of each bar represent the ACE2-S BFE change and the frequency of
each RBD mutation, respectively. The number at the top of each bar shows the antibody disruption count of each
mutation. Detailed information can be found in section S4 of the Supporting Information. One can see that 27
mutations have positive ACE2-S BFE changes, suggesting they are regulated by the infectivity-based transmission
pathway. However, three RBD mutations (S477I, D427N, and Y449S) have negative BFE changes. Notably, the Y449S
mutation has a significantly negative BFE change (−0.8112 kcal/mol) and a large antibody disruption count (85),
revealing an atypical mechanism of mutagenesis. Such a mutation, with a significantly negative ACE2-S BFE change
together with a high antibody disruption count, is called a vaccine-resistant or antibody-resistant mutation.
Figure 1d illustrates the SARS-CoV-2 S protein (pink) with human ACE2 (blue); the Y449 residue (purple) is
located on a random coil of the S protein. Among all of the vaccine-resistant mutations, the Y449S mutation has
the highest frequency (1193). In addition, at residue 449, mutations Y449H, Y449N, and Y449D are all
vaccine-resistant mutations that have been observed in more than 20 SARS-CoV-2 genome isolates.
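
The antibody disruption count and the notion of a vaccine-resistant mutation described above can be sketched as follows. The −0.3 kcal/mol cutoff for counting a disrupted antibody is the one stated in the text. The text gives no numeric cutoffs for a "significantly negative" ACE2-S BFE change or a "high" disruption count, so `ace2_cutoff` and `min_disrupted` below are placeholder assumptions, as are the function names and the individual antibody values in the toy example.

```python
DISRUPTION_CUTOFF = -0.3  # kcal/mol, as stated in the text


def antibody_disruption_count(antibody_bfe_changes: list[float]) -> int:
    """Count antibodies whose antibody-S BFE change is below the cutoff."""
    return sum(1 for change in antibody_bfe_changes if change < DISRUPTION_CUTOFF)


def is_vaccine_resistant(
    ace2_bfe_change: float,
    antibody_bfe_changes: list[float],
    ace2_cutoff: float = -0.3,  # assumed threshold for "significantly negative"
    min_disrupted: int = 50,    # assumed threshold for a "high" disruption count
) -> bool:
    """Flag a mutation that weakens ACE2 binding yet disrupts many antibodies."""
    disrupted = antibody_disruption_count(antibody_bfe_changes)
    return ace2_bfe_change < ace2_cutoff and disrupted >= min_disrupted


# Toy example: 130 antibodies, 85 of them disrupted (individual values are made up).
toy_changes = [-0.5] * 85 + [0.1] * 45
toy_count = antibody_disruption_count(toy_changes)
toy_flag = is_vaccine_resistant(-0.8112, toy_changes)
```

Keeping the two thresholds as parameters makes it easy to see how sensitive the resulting list of vaccine-resistant mutations is to those choices.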

To track the evolution trajectory of vaccine-resistant mutations, the BFE changes, log2 enrichment ratios, and
log10 frequencies of RBD mutations are analyzed from April 30, 2020, to October 22, 2021, in 60-day periods, as
illustrated in Figure 2. Here, the top 100 most observed RBD mutations are displayed. In Figure 2a, red stars
mark the vaccine-resistant mutations that have negative BFE changes. A few vaccine-resistant mutations (S438F,
I434K, Y505C, and Q506K) were detected before November 2020 with relatively low frequencies. Notably, from
December 2020 onward, such vaccine-resistant mutations were no longer in the list of the top 100 most observed
RBD mutations, suggesting that in this period the evolution of SARS-CoV-2 was mainly regulated by natural
selection through the infectivity-based transmission pathway. However, in May 2021, two vaccine-resistant
mutations (Y449S and Y449H) returned to the list of the top 100 most observed RBD mutations. In addition, the
Y449S mutation has a relatively high frequency. This finding indicates that natural selection favors not only
those mutations that enhance transmission but also those that can disrupt many antibodies, since by early May
SARS-CoV-2 vaccination had been administered to provide protection among populations. Similar patterns can be
found in Figure 2b, suggesting that our AI-predicted BFE changes are highly consistent with the deep mutational
enrichment ratios from experiments.
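
The 60-day windowing used to track mutation frequencies can be sketched as below. This is an illustration under assumptions: the input format (a list of `(collection_date, mutation)` pairs), the function name, and the toy records are hypothetical, and the text does not say whether frequencies are counted within each window or cumulatively, so the sketch counts within each window.

```python
import math
from collections import Counter, defaultdict
from datetime import date


def bin_log10_frequencies(records, start, end, window_days=60):
    """Return {window index: {mutation: log10(count)}} for records in [start, end]."""
    counts = defaultdict(Counter)
    for collected, mutation in records:
        if start <= collected <= end:
            # Window 0 covers days 0-59 counted from `start`, window 1 days 60-119, etc.
            window = (collected - start).days // window_days
            counts[window][mutation] += 1
    return {
        window: {mutation: math.log10(n) for mutation, n in counter.items()}
        for window, counter in counts.items()
    }


# Toy records (hypothetical); the date range matches the one used in the text.
toy_records = (
    [(date(2020, 5, 1), "N501Y")]
    + [(date(2021, 5, 10), "Y449S")] * 10
    + [(date(2021, 11, 1), "Y449S")]  # outside the range, ignored
)
toy_bins = bin_log10_frequencies(toy_records, date(2020, 4, 30), date(2021, 10, 22))
```

Ranking the mutations inside each window by count and keeping the top 100 would give the per-window lists that a plot like Figure 2 is built from.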

Vaccine-resistant mutations are usually found along with other RBD mutations. Therefore, analyzing the time
evolution of RBD co-mutations offers a better understanding of the mechanisms of vaccine-resistant mutations.
Panels a-c of Figure 3 illustrate the time evolution of two, three, and four RBD co-mutations, respectively,
with their corresponding BFE changes every 30 days. Here, each bar represents an RBD co-mutation, and the height
and color of each bar represent the log10 frequency and the total BFE change induced by a given RBD co-mutation,
respectively. Because the number of co-mutations was quite low in 2020, the time range for the analysis of two
co-mutations is set to [January 25, 2021, October 22, 2021]. For three and four co-mutations, the time range is
set to [June 24, 2021, October 22, 2021]. In Figure 3a, a red star marks the two co-mutations with significantly
negative BFE changes.
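
Counting RBD co-mutations can be sketched as follows. The sketch assumes each genome has already been reduced to its set of RBD mutations, and it reads a co-mutation as the full set of RBD mutations carried by one genome. It also treats the total BFE change as the sum of the single-mutation BFE changes; the text does not specify how the total is obtained, so this additivity, the function names, and every toy value except the Y449S BFE change are assumptions.

```python
from collections import Counter


def count_co_mutations(genomes, size):
    """Count genomes whose full RBD mutation set has exactly `size` members."""
    counts = Counter()
    for mutations in genomes:
        if len(mutations) == size:
            counts[tuple(sorted(mutations))] += 1
    return counts


def total_bfe_change(co_mutation, single_bfe_changes):
    """Assumed additive model: sum the single-mutation BFE changes (kcal/mol)."""
    return sum(single_bfe_changes[mutation] for mutation in co_mutation)


# Toy genomes, each given as its set of RBD mutations (hypothetical data).
toy_genomes = [
    {"Y449S", "N501Y"},
    {"Y449S", "N501Y"},
    {"K417T", "E484K", "N501Y"},
    {"L452R"},
]
toy_pairs = count_co_mutations(toy_genomes, 2)

# Y449S value is quoted in the paper; the N501Y value is a placeholder.
toy_bfe = {"Y449S": -0.8112, "N501Y": 0.5}
toy_total = total_bfe_change(("N501Y", "Y449S"), toy_bfe)
```

Sorting the mutation names before building the key ensures that the same co-mutation is counted under one label regardless of the order in which its mutations were recorded.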

At the end of March 2021, the vaccine-resistant mutation Y449D showed up with mutation N501Y in some genome
isolates, resulting in a negative BFE change (−0.473 kcal/mol) and a high antibody disruption count (98) for the
pair of RBD co-mutations (Y449D and N501Y). However, its global frequency is relatively low. Since late April
2021, the vaccine-resistant mutation Y449S has shown up with N501Y, making the RBD co-mutation pair Y449S and
N501Y one of the most prevalent vaccine-resistant co-mutations. Figure 3d shows the top 50 most observed RBD
co-mutations; the length and color of each bar represent the total BFE change and the natural log of the
frequency of an RBD co-mutation, respectively. The number at the side of each bar is the antibody disruption
count. Among the 50 most observed RBD co-mutations, the Y449S and N501Y co-mutation is the only one with a
significantly negative BFE change and an extremely high antibody disruption count (94). The evolution trajectory
of Y449S and N501Y shows that the infectivity-based transmission pathway, regulated by natural selection at the
population level, was the major evolution-driving force of SARS-CoV-2 mutagenesis before March 2021. Starting in
January 2021, several vaccines were authorized for emergency use. Two months later, because many people had been
protected by the vaccines, mutations that disrupted the binding between the S protein and antibodies could be
transmitted among vaccinated people, especially in countries with high vaccination rates. Such a
vaccine-resistant pathway reduces the efficacy of vaccines and antibody therapies, indicating that the fight
against COVID-19 will be a prolonged battle.

Similar time evolution trajectories are drawn for three and four RBD co-mutations (see panels b and c,
respectively, of Figure 3). There are no triple vaccine-resistant co-mutations at present, while a quadruple
vaccine-resistant co-mutation (K417T, Y449S, E484K, and N501Y) appeared after late August 2021. Notably, Gamma
variants, one of the variants of concern (VOC), carry three co-mutations (K417T, E484K, and N501Y) on the RBD,
which indicates that the quadruple vaccine-resistant co-mutation (K417T, Y449S, E484K, and N501Y) may be a
potential threat in the future.

Figure 4. (a) Distribution of vaccine-resistant mutation Y449S. The color bar represents the log10 frequency of
Y449S in 14 countries: Denmark (DK), the United Kingdom (UK), France (FR), Bulgaria (BG), the United States
(US), Argentina (AR), Brazil (BR), Sweden (SE), Canada (CA), Switzerland (CH), Germany (DE), Spain (ES), Romania
(RO), and Belgium (BE). The number located at the side of each country shows the total number of positive
SARS-CoV-2 cases by October 22. (b) Time evolution of the vaccination rate and the frequency of Y449S in the top
12 countries from December 26, 2020, to October 22, 2021. The data are collected per 30 days. The red line shows
the frequency of mutation Y449S. The orange and purple areas represent the rate of at least one dose and the
rate of full vaccination, respectively, in each country.

Analysis of the vaccination trends and vaccine-resistant mutations leads to a fundamental understanding of the
transmission and evolution of vaccine-resistant mutations. We investigate the distribution and time evolution of
the vaccine-resistant RBD mutation Y449S in 14 countries. As the most observed vaccine-resistant RBD mutation,
Y449S has been detected in 14 countries, namely, Denmark (DK), the United Kingdom (UK), France (FR), Bulgaria
(BG), the United States (US), Argentina (AR), Brazil (BR), Sweden (SE), Canada (CA), Switzerland (CH), Germany
(DE), Spain (ES), Romania (RO), and Belgium (BE), as illustrated in Figure 4a. Here, the 14 countries in which
Y449S was found are colored blue. The darker the blue, the higher the frequency of Y449S. The number at the side
of each country is the total number of positive cases up to October 22, 2021. Although DK has the smallest
number of positive cases among the 14 countries, its frequency of the Y449S mutation is the highest. More than
800 patients carry the vaccine-resistant mutation Y449S in DK. All of the Y449S-related cases are found in
Europe and America, where the vaccination rates are relatively high. Figure 4b shows the time evolution of the
vaccination ratio and the frequency of Y449S in the top 12 of the countries mentioned above in 30-day periods.
An illustration for CH and RO can be found in section S5 of the Supporting Information. The x-axis records the
date, which ranges from December 26, 2020, to October 22, 2021. The left-hand y-axis shows the frequency of
Y449S (red lines), and the right-hand y-axis shows the vaccination ratio. In addition, the orange region shows
the ratio of people with at least one dose, while the purple region shows the fully vaccinated ratio. One can
see that the Y449S mutation was first found in BG and the US in December 2020. However, the frequency of the
Y449S mutation in BG and the US was quite low before April 2021. After April 2021, the Y449S mutation quickly
spread to 10 other countries. Among them, the total number of cases related to Y449S tends to increase rapidly,
especially in DK, the UK, and FR. Notably, all three countries have relatively high vaccination ratios (>70% up
to late October 2021). It is worth mentioning that the frequency of the Y449S mutation is low in DE, ES, BE,
etc., mainly because the first Y449S-related cases in these countries were detected after June 2021. Since then,
Delta variants have dominated among the prevailing variants, which gave the Y449S mutation a limited chance to
spread rapidly. Moreover, from Figure 4, one can see that the frequency of the Y449S mutation tends to increase
in a manner similar to that of the fully vaccinated ratio, suggesting that vaccine-resistant mutations will
gradually become one of the main evolution-driving forces of SARS-CoV-2, especially in areas with high
vaccination rates.
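
One way to quantify how closely the Y449S frequency follows the vaccination ratio is a Pearson correlation between the two 30-day series. The sketch below is illustrative only: the text does not state which statistic, if any, was computed, and the series are invented placeholders, not data read from Figure 4.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Placeholder 30-day series (NOT real data): fully vaccinated fraction and Y449S count.
toy_vaccinated_fraction = [0.0, 0.05, 0.15, 0.30, 0.50, 0.65, 0.72]
toy_y449s_count = [0, 1, 3, 20, 150, 400, 600]
toy_r = pearson_correlation(toy_vaccinated_fraction, toy_y449s_count)
```

A high coefficient on such series indicates that the two quantities rise together; it does not by itself establish that vaccination drives the mutation, since both also increase with time.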

Because multiple mutations known to reduce the efficacy of neutralization by vaccine-generated antibodies have
appeared, it is vital to better comprehend the mechanisms of SARS-CoV-2 mutagenesis, which is of paramount
importance to the understanding of SARS-CoV-2 transmission and evolution. The driving forces of mutagenesis can
be categorized into three groups: (1) molecular-scale mechanisms, (2) organism-scale mechanisms, and (3)
population-scale mechanisms. As the initial driving force of mutagenesis, the genetic information is changed by
reading frame shifts, viral proofreading, etc., which all belong to the group of molecular-scale mechanisms.
Host gene editing, regulated by the host immune system, and rarely occurring host-virus recombination are the
two organism-scale mechanisms. The molecular- and organism-scale mechanisms provide a large number of candidate
mutations in the SARS-CoV-2 genome, while it is the population-scale mechanisms that determine which mutations
become dominant.

Natural selection is a population-scale mechanism, which promotes the surge of emerging SARS-CoV-2 variants
through two complementary pathways: infectivity and vaccine resistance. The early stage of SARS-CoV-2 evolution
was entirely dominated by infectivity-strengthening mutations. However, since late March 2021, once vaccines had
provided protection to highly vaccinated populations, several vaccine-resistant mutations such as Y449S and
Y449H have been observed relatively frequently. Considering that a good portion of the population is still not
vaccinated, infectivity-strengthening mutations still dominate among the prevailing and future variants.
However, vaccine-breakthrough or antibody-resistant mutations, like many RBD mutations associated with the
Omicron variant, will become a major mechanism of transmission once most of the population carries antibodies
through either vaccination or infection. Our studies are valuable to the development of next-generation vaccines
and mAbs, which are of great importance for the long-term fight against SARS-CoV-2.